Supporting Fault-Tolerant Parallel Programming in Linda

Authors

  • David E. Bakken
  • Richard D. Schlichting
Abstract

Linda is a language for programming parallel applications whose most notable feature is a distributed shared memory called tuple space. While suitable for a wide variety of programs, one shortcoming of the language as commonly defined and implemented is a lack of support for writing programs that can tolerate failures in the underlying computing platform. This paper describes FT-Linda, a version of Linda that addresses this problem by providing two major enhancements that facilitate the writing of fault-tolerant applications: stable tuple spaces and atomic execution of tuple space operations. The former is a type of stable storage in which tuple values are guaranteed to persist across failures, while the latter allows collections of tuple operations to be executed in an all-or-nothing fashion despite failures and concurrency. The design of these enhancements is presented in detail and illustrated by examples drawn from both the Linda and fault-tolerance domains. An implementation of FT-Linda for a network of workstations is also described. The design is based on replicating the contents of stable tuple spaces to provide failure resilience and then updating the copies using atomic multicast. This strategy allows an efficient implementation in which only a single multicast message is needed for each atomic collection of tuple space operations.
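The all-or-nothing behaviour described in the abstract can be illustrated with a small sketch. This is not FT-Linda's syntax or its replicated implementation: it is a toy, single-process tuple space in Python, and the names `TupleSpace`, `out`, `inp`, `rdp`, `atomic`, and the `task` / `in_progress` tuples are assumptions chosen for illustration. It shows only the idea that a group of tuple operations either all take effect or leave the space untouched.

```python
class TupleSpace:
    """Toy in-memory tuple space (illustrative only, not FT-Linda).

    A pattern is a tuple in which None matches any value.
    """

    def __init__(self):
        self._tuples = []

    @staticmethod
    def _matches(pattern, tup):
        return len(pattern) == len(tup) and all(
            p is None or p == t for p, t in zip(pattern, tup)
        )

    def out(self, tup):
        """Deposit a tuple into the space."""
        self._tuples.append(tuple(tup))

    def rdp(self, pattern):
        """Return a matching tuple without removing it, or None."""
        for tup in self._tuples:
            if self._matches(pattern, tup):
                return tup
        return None

    def inp(self, pattern):
        """Withdraw and return a matching tuple, or None if none matches."""
        tup = self.rdp(pattern)
        if tup is not None:
            self._tuples.remove(tup)
        return tup

    def atomic(self, ops):
        """Apply (name, arg) operations all-or-nothing.

        Operations run against a scratch copy. The copy replaces the real
        contents only if every operation succeeds; otherwise None is
        returned and the space is left unchanged.
        """
        scratch = TupleSpace()
        scratch._tuples = list(self._tuples)
        results = []
        for name, arg in ops:
            if name == "out":
                scratch.out(arg)
                results.append(tuple(arg))
            elif name == "inp":
                got = scratch.inp(arg)
                if got is None:
                    return None  # abort: the original space is untouched
                results.append(got)
            else:
                raise ValueError(f"unknown operation: {name}")
        self._tuples = scratch._tuples  # commit
        return results


ts = TupleSpace()
ts.out(("task", 7))

# Withdraw a task and record it as in progress in one atomic step.
moved = ts.atomic([("inp", ("task", None)), ("out", ("in_progress", 7))])

# This group fails at its second operation, so the first one is not applied.
failed = ts.atomic([("out", ("marker",)), ("inp", ("task", None))])
```

In a distributed setting, the abstract's approach would replace the scratch copy with replicated tuple spaces updated through atomic multicast; the sketch above does not attempt to model replication or failures.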

Similar resources

Fault Tolerance Lessons Applied to Parallel Computing

This paper describes an approach to fault-tolerant parallel computing which is based on the experiences with the most successful fault-tolerant software – the transaction processing systems. The algorithms presented here have less runtime overhead and faster recovery than most preceding approaches. In the Pact parallel programming environment fault tolerance is provided fully user transparent i...

Algorithm-based fault-tolerant programming in scientific computation on multiprocessors

Efficient parallel algorithms proposed to solve many fundamental problems in scientific computation are sensitive to processor failures. Because of its low costs, algorithm-based fault tolerance is an interesting concept for introducing fault tolerance into existing multi-processors. To facilitate fault-tolerant programming in scientific computation, we have modified and developed further an ...

A Framework for Performing Fault-Tolerant Placement Based on Genetic Algorithm

Fault-tolerance is a crucial challenge for a number of application domains. Existing solutions to this problem are applied uniformly at the entire design imposing among others mentionable delay and power overheads. In this paper we introduce a software-supported framework based on genetic algorithm for supporting fast application placement under fault-tolerant constraints. Rather than relevant ...

NASA Contractor Report 181938: Investigation of the Applicability of a Functional Programming Model to Fault Tolerant Parallel Processing for Knowledge-Based Systems

In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper Fault Tolerant Parallel Processor (FTPP). When used in conjunction with the FTPP's fault detection and masking capabilities, this implementation results in a graceful degradation...

Rapid Prototyping Of Parallel Fault Tolerant Systems

The design of a fault tolerant system would be enhanced if a rapid prototyping approach could be conceived which allows fault tolerance to be modelled quickly, and faults simulated to analyse the behaviour of the system. This paper concentrates on how a functional specification language, and its environment, can be used to describe and investigate the properties of parallel fault tolerant syste...


Publication date: 1994